Erik Kusch, PhD Student
Department of Biology
Section for Ecoinformatics & Biodiversity
Center for Biodiversity Dynamics in a Changing World (BIOCHANGE)
Aarhus University
09/04/2021
[Study Group] Bayesian Statistics with the Rethinking Material 1
Flow of information between clusters is needed to correctly assess the whole population of estimates.
Without it, each cluster is a whole new data set to our models, so they make no use of the information contained in the other clusters present in the data.
Pseudo-Replication is a Pseudo-Problem
Basic, fixed-effect model
Multi-Level Model
Make a model “with memory”
Varying (“random”) intercepts
Average intercept of all tanks
Standard deviation of the tank intercepts around the average intercept
This is adaptive regularisation!
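The equations on this slide did not survive as text. As a sketch of what the two models could look like, assuming the reed frog tank example from the Rethinking material (the symbols and the prior values are assumptions, not taken from this deck), the fixed-effect model gives every tank intercept the same fixed prior:

```latex
\begin{align*}
S_i &\sim \mathrm{Binomial}(N_i, p_i) \\
\mathrm{logit}(p_i) &= \alpha_{\mathrm{tank}[i]} \\
\alpha_j &\sim \mathrm{Normal}(0, 1.5)
\end{align*}
```

The multi-level version learns the prior of the intercepts from the data, which is where the "memory" comes from:

```latex
\begin{align*}
S_i &\sim \mathrm{Binomial}(N_i, p_i) \\
\mathrm{logit}(p_i) &= \alpha_{\mathrm{tank}[i]} \\
\alpha_j &\sim \mathrm{Normal}(\bar{\alpha}, \sigma) \\
\bar{\alpha} &\sim \mathrm{Normal}(0, 1.5) \\
\sigma &\sim \mathrm{Exponential}(1)
\end{align*}
```

Here $\bar{\alpha}$ is the average intercept of all tanks and $\sigma$ is the standard deviation of the tank intercepts around it.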
Simply add priors on the new sub-hierarchy to the model specification.
When adding parameters to a multi-level model, we may obtain a worse fit to training data but a better fit to test data.
Shrinkage effect
- Raw mean is average survivorship across all tank clusters
- Population mean is the average survivorship across all tanks weighted by tadpole number in each tank
Estimates of outcomes are pushed/shrunk towards the population mean.
High shrinkage due to small sample size
Low shrinkage due to big sample size
Sample size = number of tadpoles per tank in each cluster
High shrinkage because far from the population mean
Low shrinkage because close to the population mean
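Both shrinkage patterns can be reproduced with a small numerical sketch. It assumes a simple Gaussian partial-pooling formula (a precision-weighted average) instead of the binomial tank model, and the function name, the standard deviations and the cluster values are all hypothetical, chosen only for illustration.

```python
def partial_pool(cluster_mean, n, pop_mean, sigma_within, sigma_between):
    """Precision-weighted compromise between a cluster's raw mean and the
    population mean (Gaussian approximation, an assumption of this sketch)."""
    w_data = n / sigma_within**2      # information coming from the cluster itself
    w_pop = 1.0 / sigma_between**2    # information borrowed from the population
    return (w_data * cluster_mean + w_pop * pop_mean) / (w_data + w_pop)


pop_mean = 0.0  # hypothetical population mean on the log-odds scale

# Same raw mean, different sample sizes
small = partial_pool(2.0, n=5, pop_mean=pop_mean, sigma_within=2.0, sigma_between=1.0)
large = partial_pool(2.0, n=35, pop_mean=pop_mean, sigma_within=2.0, sigma_between=1.0)

# Same sample size, raw mean further from the population mean
far = partial_pool(4.0, n=5, pop_mean=pop_mean, sigma_within=2.0, sigma_between=1.0)

shrink_small = 2.0 - small
shrink_large = 2.0 - large
shrink_far = 4.0 - far

print(f"small cluster shrinks by {shrink_small:.3f}")
print(f"large cluster shrinks by {shrink_large:.3f}")
print(f"distant cluster shrinks by {shrink_far:.3f}")
```

The small cluster moves further towards the population mean than the large one, and the distant cluster moves further than the close one at equal sample size.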
“Pooling is the process. Shrinkage is the pattern.”
More data from which to pool
Stronger shrinkage
“What if clusters in different predictors overlap?”
“Simply add them to the model.”
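As a sketch of what "adding them" could look like, assuming the chimpanzee example from the Rethinking material with actor and block as the two cluster types (the symbols and the prior values are assumptions, not taken from this deck):

```latex
\begin{align*}
L_i &\sim \mathrm{Binomial}(1, p_i) \\
\mathrm{logit}(p_i) &= \alpha_{\mathrm{actor}[i]} + \gamma_{\mathrm{block}[i]} + \beta_{\mathrm{treatment}[i]} \\
\beta_j &\sim \mathrm{Normal}(0, 0.5) \\
\alpha_j &\sim \mathrm{Normal}(\bar{\alpha}, \sigma_{\alpha}) \\
\gamma_j &\sim \mathrm{Normal}(0, \sigma_{\gamma}) \\
\bar{\alpha} &\sim \mathrm{Normal}(0, 1.5) \\
\sigma_{\alpha} &\sim \mathrm{Exponential}(1) \\
\sigma_{\gamma} &\sim \mathrm{Exponential}(1)
\end{align*}
```

Each cluster type gets its own varying intercepts and its own standard deviation, so the amount of pooling is learned separately for actors and for blocks.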
Always pulls the left lever in this range and higher
More variation in actors than in blocks
More shrinkage on block estimates than on actor estimates
Adaptive shrinkage ensures that our model is not negatively affected by the addition of a cluster type with little variation therein.
All MCMC-based algorithms suffer from this but, in most cases, fail to provide diagnostics.
Usually, the model is wrong.
Re-Parameterisation is usually the better of the two options.
Re-Centred
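A minimal sketch of the re-parameterisation idea, assuming the non-centred form described in the Rethinking material: instead of drawing each intercept directly from Normal(a_bar, sigma), draw a standardised offset z from Normal(0, 1) and compute a_bar + sigma * z. The values of a_bar and sigma below are hypothetical, and plain random draws stand in for an MCMC sampler.

```python
import random
import statistics

random.seed(1)

a_bar, sigma = 1.5, 0.8   # hypothetical average intercept and spread
n_draws = 20000

# Centred form: intercepts drawn directly from Normal(a_bar, sigma)
centred = [random.gauss(a_bar, sigma) for _ in range(n_draws)]

# Non-centred form: standardised offsets, then shifted and scaled
z = [random.gauss(0.0, 1.0) for _ in range(n_draws)]
non_centred = [a_bar + sigma * z_i for z_i in z]

centred_mean = statistics.fmean(centred)
centred_sd = statistics.stdev(centred)
non_centred_mean = statistics.fmean(non_centred)
non_centred_sd = statistics.stdev(non_centred)

print(f"centred:     mean={centred_mean:.2f}, sd={centred_sd:.2f}")
print(f"non-centred: mean={non_centred_mean:.2f}, sd={non_centred_sd:.2f}")
```

Both forms describe the same distribution of intercepts. The difference is that the sampler explores z, whose scale no longer depends on sigma, which is typically what makes the posterior easier to sample.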
Do not expect the posterior predictive distribution to match the raw data.
Adaptive shrinkage makes that difficult.
Post-stratification: re-weighting the cluster estimates by each cluster's share of the target population can help correct predictions for sampling bias.
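A minimal sketch of post-stratification, assuming the weights are each cluster's share of the target population. The cluster names, estimates, sample sizes and population shares are hypothetical, as is the helper function.

```python
def weighted_estimate(estimates, weights):
    """Weighted average of per-cluster estimates; weights are normalised here."""
    total = sum(weights[k] for k in estimates)
    return sum(estimates[k] * weights[k] / total for k in estimates)


# Hypothetical per-cluster estimates (e.g. posterior means of a proportion)
estimates = {"A": 0.2, "B": 0.6}

# Cluster A is heavily over-represented in the sample ...
sample_sizes = {"A": 80, "B": 20}
# ... but both clusters make up half of the target population
population_shares = {"A": 0.5, "B": 0.5}

naive = weighted_estimate(estimates, sample_sizes)
post_stratified = weighted_estimate(estimates, population_shares)

print(f"weighted by sample size:      {naive:.2f}")
print(f"weighted by population share: {post_stratified:.2f}")
```

Weighting by sample size pulls the overall estimate towards the over-sampled cluster, while weighting by population share removes that imbalance.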